Search Results
Improving the performance of small models with knowledge distillation
MedAI #88: Distilling Step-by-Step! Outperforming LLMs with Smaller Model Sizes | Cheng-Yu Hsieh
Knowledge Distillation with TAs
Knowledge Distillation: A Good Teacher is Patient and Consistent
Model Distillation: Same LLM Power but 3240x Smaller
Quantization vs Pruning vs Distillation: Optimizing NNs for Inference
Qi Wu – Compress language models to effective & resource-saving models with knowledge distillation
Knowledge Distillation: The story of small language model learning from large teacher models
Better not Bigger: Distilling LLMs into Specialized Models
Gurtam DevConf 2024: How knowledge distillation inspires foundation models? | Veronika Suprunovich
EfficientML.ai Lecture 9 - Knowledge Distillation (MIT 6.5940, Fall 2023)
[2024 Best AI Paper] Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
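All of the results above center on knowledge distillation, so a minimal sketch of the standard distillation loss may help orient a reader before picking a talk. This is an illustrative assumption of the common setup (Hinton-style soft targets blended with hard-label cross-entropy), not the method of any specific result listed; the function name and hyperparameters (T, alpha) are made up for the example.

# Minimal sketch of a standard knowledge-distillation loss.
# Illustrative only; names and hyperparameters are assumptions, not taken
# from any of the talks or papers listed above.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-softened teacher and
    # student distributions, scaled by T^2 to keep gradient magnitudes
    # comparable to the hard-label term.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy on ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss

# Toy usage: batch of 4 examples, 10 classes.
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()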